 Pampas


Artificial Intelligence and Deepfakes: The Growing Problem of Fake Porn Images

Der Spiegel International

In San Francisco, meanwhile, a lawsuit is underway against the operators of a number of nudify apps. In some instances, the complaint identifies the defendants by name, but in the case of Clothoff, the accused is only listed as "Doe," the name frequently used in the U.S. for unknown defendants. According to the website's imprint, Clothoff is operated out of the Argentinian capital Buenos Aires. But the company has concealed the true identities of its operators through the use of shell companies and other methods. For a time, operators even sought to mislead the public with a fake image, presumably generated by AI, of the purported head of Clothoff.


Reboot of Buenos Aires facial recognition plan fuels privacy fears

The Japan Times

After a relaxing weekend away, Guillermo Ibarrola was walking out of a train station in Argentina's capital when police arrested him and accused him of a robbery committed hundreds of miles away in a place he had never visited. "It was a nightmare," Ibarrola told local media after the 2019 incident, which rights campaigners say highlights the risks of using facial recognition systems to survey populations. The system of 300 cameras linked to a national crime database -- dubbed Buenos Aires' Big Brother -- was suspended two years ago after a court found it may have been used to collect data on journalists, politicians and human rights activists, and ruled it unconstitutional.


Feasibility of machine learning-based rice yield prediction in India at the district level using climate reanalysis data

arXiv.org Artificial Intelligence

Yield forecasting, the science of predicting agricultural productivity before the crop harvest occurs, helps a wide range of stakeholders make better decisions around agricultural planning. This study aims to investigate whether machine learning-based yield prediction models can capably predict Kharif season rice yields at the district level in India several months before the rice harvest takes place. The methodology involved training 19 machine learning models such as CatBoost, LightGBM, Orthogonal Matching Pursuit, and Extremely Randomized Trees on 20 years of climate, satellite, and rice yield data across 247 of India's rice-producing districts. In addition to model-building, a dynamic dashboard was built to understand how the reliability of rice yield predictions varies across districts. The results of the proof-of-concept machine learning pipeline demonstrated that rice yields can be predicted with a reasonable degree of accuracy, with out-of-sample $R^2$, MAE, and MAPE performance of up to 0.82, 0.29, and 0.16 respectively. These results outperformed test set performance reported in related literature on rice yield modeling in other contexts and countries. In addition, SHAP value analysis was conducted to infer both the importance and directional impact of the climate and remote sensing variables included in the model. Important features driving rice yields included temperature, soil water volume, and leaf area index. In particular, higher temperatures in August correlate with increased rice yields, particularly when the leaf area index in August is also high. Building on the results, a proof-of-concept dashboard was developed to allow users to easily explore which districts may experience a rise or fall in yield relative to the previous year.
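For readers unfamiliar with the three scores quoted in this abstract, the following is a minimal sketch of how the coefficient of determination, MAE, and MAPE are conventionally computed from observed and predicted values. The function name `regression_metrics` and the sample yields are hypothetical and are not taken from the paper, whose own evaluation code is not shown here.

```python
def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, and MAPE for paired observed/predicted values.

    Hypothetical helper for illustration; assumes equal-length, non-empty
    inputs, non-constant observations, and no zero values in y_true
    (MAPE is undefined for a zero observation).
    """
    n = len(y_true)
    mean_true = sum(y_true) / n

    # Residual and total sums of squares for the coefficient of determination.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)

    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mape = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n
    return {"r2": r2, "mae": mae, "mape": mape}


# Made-up yields, purely to exercise the function.
observed = [2.0, 3.0, 4.0]
predicted = [2.5, 3.0, 3.5]
scores = regression_metrics(observed, predicted)
print(scores)
```

MAPE is returned here as a fraction rather than a percentage, which matches the scale of the 0.16 figure quoted in the abstract.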


Improvement and generalization of ABCD method with Bayesian inference

arXiv.org Artificial Intelligence

To find New Physics or to refine our knowledge of the Standard Model at the LHC is an enterprise that involves many factors. We focus on taking advantage of available information and put our effort into rethinking the usual data-driven ABCD method to improve it and to generalize it using Bayesian Machine Learning tools. We propose that a dataset consisting of a signal and many backgrounds is well described through a mixture model. Signal, backgrounds and their relative fractions in the sample can be well extracted by exploiting the prior knowledge and the dependence between the different observables at the event-by-event level with Bayesian tools. We show how, in contrast to the ABCD method, one can take advantage of understanding some properties of the different backgrounds and of having more than two independent observables to measure in each event. In addition, instead of regions defined through hard cuts, the Bayesian framework uses the information of continuous distributions to obtain soft assignments of the events, which are statistically more robust. To compare both methods we use a toy problem inspired by $pp\to hh\to b\bar b b \bar b$, selecting a reduced and simplified number of processes and analysing the flavor of the four jets and the invariant mass of the jet-pairs, modeled with simplified distributions. Taking advantage of all this information, and starting from a combination of biased and agnostic priors, leads us to a very good posterior once we use the Bayesian framework to exploit the data and the mutual information of the observables at the event-by-event level. We show how, in this simplified model, the Bayesian framework outperforms the sensitivity of the ABCD method in obtaining the signal fraction in scenarios with $1\%$ and $0.5\%$ true signal fractions in the dataset. We also show that the method is robust against the absence of signal.
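As background for the comparison drawn in this abstract, here is a minimal sketch of the conventional ABCD estimate that the paper sets out to improve on. It assumes the common textbook setup: two independent observables, each with one hard cut, giving a signal region A and three control regions B, C, D, with the background in A estimated as N_B * N_C / N_D. The region labeling, function names, and toy events are assumptions for illustration and do not come from the paper.

```python
def count_regions(events, x_cut, y_cut):
    """Count events in the four ABCD regions defined by two hard cuts.

    Assumed labeling (conventions vary between analyses):
      A: passes both cuts (signal region)
      B: passes x only, C: passes y only, D: fails both.
    """
    counts = {"A": 0, "B": 0, "C": 0, "D": 0}
    for x, y in events:
        pass_x = x > x_cut
        pass_y = y > y_cut
        if pass_x and pass_y:
            counts["A"] += 1
        elif pass_x:
            counts["B"] += 1
        elif pass_y:
            counts["C"] += 1
        else:
            counts["D"] += 1
    return counts


def abcd_background_estimate(counts):
    """Estimate the background yield in region A as N_B * N_C / N_D.

    Valid only if the two observables are independent for the background
    and the control regions are free of signal.
    """
    if counts["D"] == 0:
        raise ValueError("region D is empty; the estimate is undefined")
    return counts["B"] * counts["C"] / counts["D"]


# Toy events (x, y), invented purely to exercise the functions.
toy_events = (
    [(0.9, 0.9)] * 2    # region A
    + [(0.9, 0.1)] * 2  # region B
    + [(0.1, 0.9)] * 3  # region C
    + [(0.1, 0.1)] * 6  # region D
)
toy_counts = count_regions(toy_events, x_cut=0.5, y_cut=0.5)
expected_background = abcd_background_estimate(toy_counts)
print(toy_counts, expected_background)
```

Each event falls into exactly one region here, which is the hard-cut assignment that the abstract contrasts with the soft assignments of the Bayesian mixture model.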


MosquIoT: A System Based on IoT and Machine Learning for the Monitoring of Aedes aegypti (Diptera: Culicidae)

arXiv.org Artificial Intelligence

Millions of people around the world are infected with mosquito-borne diseases each year. One of the most dangerous species is Aedes aegypti, the main vector of viruses such as dengue, yellow fever, chikungunya, and Zika, among others. Mosquito prevention and eradication campaigns are essential to avoid major public health consequences. In this respect, entomological surveillance is an important tool. At present, this traditional monitoring tool is executed manually and requires digital transformation to help authorities make better decisions, improve their planning efforts, speed up execution, and better manage available resources. Therefore, new technological tools based on proven techniques need to be designed and developed. However, such tools should also be cost-effective, autonomous, reliable, and easy to implement, and should be enabled by connectivity and multi-platform software applications. This paper presents the design, development, and testing of an innovative system named MosquIoT. It is based on traditional ovitraps with embedded Internet of Things (IoT) and Tiny Machine Learning (TinyML) technologies, which enable the detection and quantification of Ae. aegypti eggs. This innovative and promising solution may help dynamically understand the behavior of Ae. aegypti populations in cities, shifting from the current reactive entomological monitoring model to a proactive and predictive digital one.


Hyperparameter optimization of hp-greedy reduced basis for gravitational wave surrogates

arXiv.org Artificial Intelligence

In a previous work we introduced, in the context of gravitational wave science, an initial study on an automated domain-decomposition approach for reduced basis through hp-greedy refinement. The approach constructs local reduced bases of lower dimensionality than global ones, with the same or higher accuracy. These ``light'' local bases should imply both faster evaluations when predicting new waveforms and faster data analysis, in particular faster statistical inference (the forward and inverse problems, respectively). In this approach, however, we have previously found important dependence on several hyperparameters, which do not appear in global reduced basis. This naturally leads to the problem of hyperparameter optimization (HPO), which is the subject of this paper. We tackle the problem through a Bayesian optimization, and show its superiority when compared to grid or random searches. We find that for gravitational waves from the collision of two spinning but non-precessing black holes, for the same accuracy, local hp-greedy reduced bases with HPO have a lower dimensionality of up to $4 \times$ for the cases studied here, depending on the desired accuracy. This factor should directly translate into a parameter estimation speedup, for instance. Such acceleration might help in the near real-time requirements for electromagnetic counterparts of gravitational waves from compact binary coalescences. In addition, we find that the Bayesian approach used in this paper for HPO is two orders of magnitude faster than, for example, a grid search, with about a $100 \times$ acceleration. The code developed for this project is available as open source from public repositories.


Automatic Evaluation of Attribution by Large Language Models

arXiv.org Artificial Intelligence

A recent focus of large language model (LLM) development, as exemplified by generative search engines, is to incorporate external references to generate and support its claims. However, evaluating the attribution, i.e., verifying whether the generated statement is fully supported by the cited reference, remains an open problem. Although human evaluation is common practice, it is costly and time-consuming. In this paper, we investigate the automatic evaluation of attribution given by LLMs. We begin by defining different types of attribution errors, and then explore two approaches for automatic evaluation: prompting LLMs and fine-tuning smaller LMs. The fine-tuning data is repurposed from related tasks such as question answering, fact-checking, natural language inference, and summarization. We manually curate a set of test examples covering 12 domains from a generative search engine, New Bing. Our results on this curated test set and simulated examples from existing benchmarks highlight both promising signals and challenges. We hope our problem formulation, testbeds, and findings will help lay the foundation for future studies on this important problem.


You Need to Update Google Chrome or Whatever Browser You Use

WIRED

China-linked hackers are increasingly moving beyond espionage and into the disturbing world of power grid attacks. Threat researchers at security software firm Symantec this week released new evidence that the Chinese hacking group known as APT41 infiltrated the power grid of an Asian nation. Some details of the latest intrusion echo a 2021 attack on India's power grid, suggesting the same hackers are responsible. In Argentina, a scandal is playing out over the use of facial recognition software in Buenos Aires. Despite laws that require authorities to limit searches to known fugitives, an investigation by a judge found that the system was used to look up people not wanted for any crimes.


The Twisted Eye in the Sky Over Buenos Aires

WIRED

This story was made possible with support from the Pulitzer Center's AI Accountability Network. "And then the nightmare began," says Guillermo Ibarrola, recalling his arrest at the crowded train station in the city center of Buenos Aires where we stand. He points to the cameras at the end of the tracks, then his finger pans to a door at the edge of the large station hall of the heritage-listed building. "That's where they kept me for six days." He slept on bare concrete, in a small cell.


ML Research Engineer at Intuition Machines - Buenos Aires, Buenos Aires, Argentina

#artificialintelligence

Intuition Machines uses AI/ML to build enterprise security products. We apply our research to systems that serve hundreds of millions of people, with a team distributed around the world. If you enjoy working at scale on both architecture and data, engineering our backend systems may be your ideal job. Our approach is simple: light specs, small teams, and rapid iteration. We are committed to building an inclusive and diverse global workforce.